ABSTRACT

Based on a completely random effects model, optimal designs were constructed for the estimation of five
variance components in a model that has both crossed and nested factors, usually called a nested factorial.
We considered a scenario where the same balanced two-stage hierarchical nested design is nested within the
treatment combinations of a two-way crossed classification. Groups of designs with the same total sample size
were generated and, for a particular configuration of the variance components, the generated designs were
compared for A-optimality and D-optimality based on the information matrix of the maximum likelihood estimators.
I. INTRODUCTION

We are interested in designing experiments for the following sampling scenario. Consider two crossed
factors A and B, where a levels are sampled from a large population of levels of A and b levels are sampled from
a large population of levels of B. Within each of the ab cells there is another factor C with c levels sampled from a
large population of levels of C, and m observations are made at each level of factor C, giving a total of abcm
observations. As a practical example, an experiment was conducted to determine the variation in the porosity of
flour across batches of production. Flour is made from different varieties of cassava and wheat, and the
experimenter is also interested in variability across varieties of wheat and cassava. Knowing which source of
variation is largest could help to focus quality improvement efforts. The resulting model for this experimental
scenario has two random crossed factor effects with an interaction effect and a nested factor effect within
each treatment combination of the crossed factors, together with the random error term, making five variance
components.

Aviles and Pinheiro (2001) provided an experimental design scenario for the estimation of fixed effects
associated with crossed factors and two variance components associated with a nested factor and the random
error term. The assumption of fixed effects for the crossed factors is common to the few published works on
models of this kind. Smith and Beverly (1981) introduced the idea of the nested factorial, an experimental
design where some factors appear in a factorial relationship and others in a nested relationship. The split
factorial design was introduced by Ankenman, Liu, Karr, and Picka (2001). A split factorial is an experimental
design which splits a factorial design into sub-experiments; a different nested design is used for each
sub-experiment, but within a sub-experiment all design points (treatment combinations) have the same nested
design. These are some of the few published works, all with an assumption of fixed effects for the crossed
factors. In general, a linear mixed effects model is used for all of these experimental scenarios.

The random effects model, also known as ANOVA model II, has been used to describe several
experimental situations in the literature. Experiments where the primary interest lies in making inferences about
the variances of the random effects have been demonstrated with the one-way model, the two-way crossed model
and the nested model, depending on the setting of the experiment. However, whenever the setting of an experiment
involves both crossed and nested factors, it is usually assumed that the effects due to the crossed factors are
fixed and the effects due to the nested factor are random.

We will assume a linear model like that of the assembled design of Aviles and Pinheiro (2001), but with
random effects for both the crossed factors and the nested factor, resulting in a completely random model.
II. LITERATURE REVIEW

Most of the published work on designs for variance components estimation dates back to the 1960s and
1970s and has been restricted to specific models, namely the one-way random model, the two-way crossed
classification random model and the two-way nested model. R. L. Anderson and many of his co-workers were the
main contributors to the design area during that period (Anderson 1975, 1981). For the one-way model,
Hammersley (1949), Crump (1954) and Anderson & Crump (1967) were some of the earliest authors. Hammersley
(1949) showed that for a fixed N, the variance $\mathrm{Var}(\hat{\sigma}_a^2)$ is minimized by allocating an
equal number n of observations to each class, where

$$n = \frac{N\rho + N + 1}{N\rho + 2},$$

with $\rho = \sigma_a^2/\sigma_e^2$ the ratio of the between-class to the within-class variance component. Since
this formula may not yield an integer value, it was suggested that the closest integer value for n be chosen.
Crump (1954) and Anderson & Crump (1967) showed that for fixed a and N, $\mathrm{Var}(\hat{\sigma}_a^2)$ is
minimized when $n_i = n = N/a$ for all i. The optimal value for a in this case is given as

$$a = \frac{N(N\rho + 2)}{N(\rho + 1) + 1} = \frac{N}{n}.$$

Other authors are Kussmaul & Anderson (1967), Thompson and Anderson (1975), Herrendörfer (1979),
Mukerjee & Huda (1988), Giovagnoli & Sebastiani (1989) and Norell (2006). Norell (2006) studied design effects
for the one-way random model using the information matrix of the maximum likelihood estimators.

The construction of optimal designs for the two-way crossed model seems to have been considered first
by Gaylor (1960). He considered the problem of optimal designs to estimate variance components using the
fitting-constants method of estimation of variance components for unbalanced data. Bush (1962), Bush
and Anderson (1963), Hirotsu (1966) and Mostafa (1967) are some of the other contributors to the design of
experiments using the two-way random model.

Some pioneering articles that address the problem of estimating variance components in a nested
classification are Prairie (1962), Prairie and Anderson (1962) and Bainbridge (1965). They
proposed designs that systematically spread the information in the experiment more equally among the variance
components. Goldsmith and Gaylor (1970) carried out an extensive investigation of optimal designs for estimating
variance components in a completely random nested classification. Delgado (1999) defined a class of
unbalanced designs for estimating variance components in the three-stage nested classification using the ANOVA
method of estimation.

For the crossed-nested model, no assumption of a completely random model has been made; the works that
design experiments for variance component estimation are based on the linear mixed effects model. Smith and
Beverly (1981), Ankenman, Liu, Karr, and Picka (2001) and Aviles and Pinheiro (2001) are authors who have
published such work.

III. DESCRIPTION /CHARACTERIZATION OF THE DESIGN 

The class of crossed factor designs with a hierarchical nested design (HND) placed at each treatment
combination is large and contains many designs that are too complex for practical use. This work focuses on a
two-way crossed factor design with the same balanced HND placed at each treatment combination. It also assumes
that the crossed factors and the nested factor are random, resulting in a completely random model.

The design is described as follows: a = number of levels sampled from a large population of levels of
A; b = number of levels sampled from a large population of levels of B; c = number of levels of the nested
factor C within each treatment combination; T = the number of treatment combinations (a × b); m = number
of observations at each level of C; n = the number of observations in each treatment combination (m × c); and
N = abcm. Figure 1 is an example of a design with two levels of factor A, two levels of factor B, two levels of
the nested factor C and two observations at each level of C. The resulting design has four treatment
combinations and a total of four observations at each treatment combination.

Figure 1: Example design with treatment combinations a1b1, a1b2, a2b1 and a2b2 (a = 2, b = 2, c = 2, m = 2)

3.1 Analysis of Design (Model and Variance Structure) 

The linear random effects model with interaction of the crossed factors used to represent the response in
the design is written in vector form:

$$y = \mathbf{1}\mu + Z_1 w_1 + Z_2 w_2 + Z_3 w_3 + Z_4 w_4 + Z_0 w_0 \qquad (1)$$

where $y$ is a vector of $abcm$ observations, $\mu$ is the overall mean, $Z_i$ is an indicator matrix associated
with the ith variance component, and $w_i$ is a vector of normally distributed random effects associated with the
ith variance component such that $w_i \sim N(0, \sigma_i^2 I)$. The $N \times N$ variance-covariance matrix of
the observations, with $N = Tn = abcm$, is

$$V = \mathrm{Var}(y) = \sigma_1^2 Z_1 Z_1' + \sigma_2^2 Z_2 Z_2' + \sigma_3^2 Z_3 Z_3' + \sigma_4^2 Z_4 Z_4' + \sigma_0^2 Z_0 Z_0' \qquad (2)$$



$Z_1$ and $Z_2$ are indicator matrices associated with the variance components of factor A and factor B
respectively. $Z_1$ and $Z_2$ have as many rows as the total number of observations ($abcm$) and as many columns
as the number of levels of factor A and B respectively. $Z_3$ is an indicator matrix associated with the variance
component of the interaction. $Z_3$ has as many rows as the number of observations ($abcm$) and as many columns
as the number of treatment combinations $T = ab$. $Z_4$ is an indicator matrix associated with the variance
component of the nested factor C:

$$Z_4 = \begin{bmatrix} Z_{41} & & 0 \\ & \ddots & \\ 0 & & Z_{4T} \end{bmatrix} \qquad (3)$$

$Z_{4t}$ has $cm$ rows and as many columns as the number of levels of factor C used in treatment combination t.
Since the same nested design is used in each treatment combination, $Z_4$ has as many rows as the
total number of observations ($abcm$) and as many columns as the number of levels of factor C used in each
treatment combination multiplied by the number of treatment combinations ($C = cT$). $Z_0$ is an identity
matrix of order $abcm$.

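We define the Z's as the following Kronecker products, where $I_n$ is the identity matrix of order n,
$\mathbf{1}_n$ is an $n \times 1$ vector of ones and $J_n = \mathbf{1}_n \mathbf{1}_n'$:

$$\begin{aligned}
Z_0 &= I_a \otimes I_b \otimes I_c \otimes I_m, & Z_0 Z_0' &= I_a \otimes I_b \otimes I_c \otimes I_m \\
Z_1 &= I_a \otimes \mathbf{1}_b \otimes \mathbf{1}_c \otimes \mathbf{1}_m, & Z_1 Z_1' &= I_a \otimes J_b \otimes J_c \otimes J_m \\
Z_2 &= \mathbf{1}_a \otimes I_b \otimes \mathbf{1}_c \otimes \mathbf{1}_m, & Z_2 Z_2' &= J_a \otimes I_b \otimes J_c \otimes J_m \\
Z_3 &= I_a \otimes I_b \otimes \mathbf{1}_c \otimes \mathbf{1}_m, & Z_3 Z_3' &= I_a \otimes I_b \otimes J_c \otimes J_m \\
Z_4 &= I_a \otimes I_b \otimes I_c \otimes \mathbf{1}_m, & Z_4 Z_4' &= I_a \otimes I_b \otimes I_c \otimes J_m
\end{aligned} \qquad (4)$$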
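
The Kronecker-product definitions in (4) can be checked numerically. The following is a minimal sketch, assuming NumPy is available; the helper names `kron_all`, `indicator_matrices` and `covariance` are illustrative and do not come from the paper. The variance components are passed in the index order of equation (2), that is error, A, B, interaction, nested C.

```python
import numpy as np
from functools import reduce


def kron_all(*mats):
    """Kronecker product of several matrices, taken left to right."""
    return reduce(np.kron, mats)


def indicator_matrices(a, b, c, m):
    """Return [Z0, Z1, Z2, Z3, Z4] for the balanced design (a, b, c, m)."""
    I = np.eye

    def one(n):
        return np.ones((n, 1))

    Z0 = kron_all(I(a), I(b), I(c), I(m))        # error term
    Z1 = kron_all(I(a), one(b), one(c), one(m))  # crossed factor A
    Z2 = kron_all(one(a), I(b), one(c), one(m))  # crossed factor B
    Z3 = kron_all(I(a), I(b), one(c), one(m))    # A x B interaction
    Z4 = kron_all(I(a), I(b), I(c), one(m))      # nested factor C
    return [Z0, Z1, Z2, Z3, Z4]


def covariance(a, b, c, m, sigma2):
    """V = sum_i sigma2[i] * Z_i Z_i', with sigma2 ordered like Z0..Z4."""
    Z = indicator_matrices(a, b, c, m)
    return sum(s * Zi @ Zi.T for s, Zi in zip(sigma2, Z))


# The design of Figure 1 with arbitrary illustrative variance components.
Z = indicator_matrices(2, 2, 2, 2)
V = covariance(2, 2, 2, 2, [1.0, 0.5, 0.4, 0.3, 0.2])
print([Zi.shape for Zi in Z], V.shape)
```

For the design of Figure 1 every Z has 16 rows, and the column counts are abcm, a, b, ab and abc in turn, which matches the description of the indicator matrices in the text.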

3.2 Method of Estimation 

Several estimation methods have been developed for the estimation of variance components in random
and mixed effects models for both balanced and unbalanced data. This work uses the
maximum likelihood method to estimate the parameters of the linear random effects model. A maximum
likelihood (ML) estimator is a point in the parameter space where the likelihood function attains its absolute
maximum. ML estimators are generally derived as the solutions of the likelihood equations obtained by equating
to zero the partial derivatives of the log-likelihood function with respect to the parameters of the distribution.
Unfortunately, for the linear random effects model, explicit solutions for the ML estimators cannot be obtained,
and neither can the exact sampling variances. We can, however, obtain asymptotic (large-sample)
variances and covariances of the ML estimates of the parameters of our model as the inverse of the Fisher
information matrix, which is the negative of the expected matrix of second-order partial derivatives of the
log-likelihood. The information matrix is given as



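$$In\begin{bmatrix} \beta \\ \sigma^2 \end{bmatrix} = \begin{bmatrix} X'V^{-1}X & 0 \\ 0 & \tfrac{1}{2}\left\{\mathrm{tr}\left(V^{-1} Z_i Z_i' V^{-1} Z_j Z_j'\right)\right\}_{i,j} \end{bmatrix}, \qquad i, j = 0, \ldots, 4 \qquad (M1)$$

where $\beta$ represents the fixed effects occurring in the model. Since the only fixed effect in the model is the
general mean ($\mu$), the information matrix concerns only the random effects in the model:

$$In[\sigma^2] = \tfrac{1}{2}\left\{\mathrm{tr}\left(V^{-1} Z_i Z_i' V^{-1} Z_j Z_j'\right)\right\}_{i,j}, \qquad i, j = 0, \ldots, 4$$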



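The inverse of V is obtained using the results of Henderson and Searle (1979). Writing $\bar{J}_n = J_n/n$
and $C_n = I_n - \bar{J}_n$,

$$\begin{aligned}
V^{-1} ={}& \theta_0^{-1}(I_a \otimes I_b \otimes I_c \otimes C_m) + \theta_1^{-1}(I_a \otimes I_b \otimes C_c \otimes \bar{J}_m) + \theta_2^{-1}(C_a \otimes C_b \otimes \bar{J}_c \otimes \bar{J}_m) \\
&+ \theta_3^{-1}(\bar{J}_a \otimes C_b \otimes \bar{J}_c \otimes \bar{J}_m) + \theta_4^{-1}(C_a \otimes \bar{J}_b \otimes \bar{J}_c \otimes \bar{J}_m) + \theta_5^{-1}(\bar{J}_a \otimes \bar{J}_b \otimes \bar{J}_c \otimes \bar{J}_m)
\end{aligned}$$

where:

$$\begin{aligned}
\theta_0 &= \sigma_e^2, \\
\theta_1 &= \sigma_e^2 + m\sigma_\gamma^2, \\
\theta_2 &= \sigma_e^2 + m\sigma_\gamma^2 + cm\sigma_{\alpha\beta}^2, \\
\theta_3 &= \sigma_e^2 + m\sigma_\gamma^2 + cm\sigma_{\alpha\beta}^2 + acm\sigma_\beta^2, \\
\theta_4 &= \sigma_e^2 + m\sigma_\gamma^2 + cm\sigma_{\alpha\beta}^2 + bcm\sigma_\alpha^2, \\
\theta_5 &= \sigma_e^2 + m\sigma_\gamma^2 + cm\sigma_{\alpha\beta}^2 + bcm\sigma_\alpha^2 + acm\sigma_\beta^2.
\end{aligned}$$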
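
The spectral form of the inverse can be verified numerically for a small design. The sketch below assumes NumPy and the usual conventions that J-bar is the averaging matrix J_n/n and C_n = I_n minus J-bar (these conventions are an assumption, since the extracted text does not state them); the function name `v_and_inverse` is illustrative. It builds V directly from the Kronecker products and compares it against the six-term inverse with the eigenvalues theta_0 to theta_5.

```python
import numpy as np
from functools import reduce


def kron_all(*mats):
    """Kronecker product of several matrices, taken left to right."""
    return reduce(np.kron, mats)


def v_and_inverse(a, b, c, m, s_e, s_g, s_ab, s_b, s_a):
    """Return V and its spectral-form inverse for the balanced design."""
    I = np.eye

    def J(n):
        return np.ones((n, n))

    def Jbar(n):                      # averaging matrix J_n / n
        return np.ones((n, n)) / n

    def C(n):                         # centering matrix I_n - Jbar_n
        return np.eye(n) - Jbar(n)

    V = (s_e * kron_all(I(a), I(b), I(c), I(m))      # error
         + s_g * kron_all(I(a), I(b), I(c), J(m))    # nested factor C
         + s_ab * kron_all(I(a), I(b), J(c), J(m))   # interaction
         + s_b * kron_all(J(a), I(b), J(c), J(m))    # factor B
         + s_a * kron_all(I(a), J(b), J(c), J(m)))   # factor A

    th0 = s_e
    th1 = th0 + m * s_g
    th2 = th1 + c * m * s_ab
    th3 = th2 + a * c * m * s_b
    th4 = th2 + b * c * m * s_a
    th5 = th2 + b * c * m * s_a + a * c * m * s_b

    Vinv = (kron_all(I(a), I(b), I(c), C(m)) / th0
            + kron_all(I(a), I(b), C(c), Jbar(m)) / th1
            + kron_all(C(a), C(b), Jbar(c), Jbar(m)) / th2
            + kron_all(Jbar(a), C(b), Jbar(c), Jbar(m)) / th3
            + kron_all(C(a), Jbar(b), Jbar(c), Jbar(m)) / th4
            + kron_all(Jbar(a), Jbar(b), Jbar(c), Jbar(m)) / th5)
    return V, Vinv


V, Vinv = v_and_inverse(3, 2, 2, 2, 0.12, 0.13, 0.24, 0.25, 0.26)
print(np.allclose(V @ Vinv, np.eye(V.shape[0])))
```

An unequal choice of a and b is used on purpose, so that a swap of the roles of theta_3 and theta_4 would be detected by the check.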

A useful result for computing the information matrix is given by

$$\mathrm{tr}\left(V^{-1} Z_i Z_i' V^{-1} Z_j Z_j'\right) = \mathrm{sesq}\left(Z_i' V^{-1} Z_j\right), \qquad i, j = 0, \ldots, 4,$$

where sesq represents the sum of squares of all elements in the matrix (see Searle et al., p. 247).
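
The trace identity holds for any symmetric positive definite V, so it can be illustrated without the design structure. The sketch below assumes NumPy; the matrices are random stand-ins chosen only for illustration, not quantities from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 12

# A random symmetric positive definite matrix standing in for V.
A = rng.normal(size=(n, n))
V = A @ A.T + n * np.eye(n)
Vinv = np.linalg.inv(V)

# Random 0/1 matrices standing in for two indicator matrices Z_i and Z_j.
Zi = rng.integers(0, 2, size=(n, 3)).astype(float)
Zj = rng.integers(0, 2, size=(n, 4)).astype(float)

lhs = np.trace(Vinv @ Zi @ Zi.T @ Vinv @ Zj @ Zj.T)
rhs = np.sum((Zi.T @ Vinv @ Zj) ** 2)   # sesq: sum of squared elements
print(np.isclose(lhs, rhs))
```

The right-hand side only needs the small matrix with as many rows as Z_i has columns and as many columns as Z_j has, which is why the identity is convenient when N is large.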
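The information matrix then is

$$In[\sigma^2] = \frac{1}{2}\begin{bmatrix}
t_{ee} & t_{\gamma\gamma}/m & t_{(\alpha\beta)}/cm & t_{\beta\beta}/acm & t_{\alpha\alpha}/bcm \\
 & t_{\gamma\gamma} & t_{(\alpha\beta)}/c & t_{\beta\beta}/ac & t_{\alpha\alpha}/bc \\
 & & t_{(\alpha\beta)} & t_{\beta\beta}/a & t_{\alpha\alpha}/b \\
 & & & t_{\beta\beta} & ab(cm)^2/\theta_5^2 \\
\text{symmetric} & & & & t_{\alpha\alpha}
\end{bmatrix} \qquad (M2)$$

with rows and columns ordered as $(\sigma_e^2, \sigma_\gamma^2, \sigma_{\alpha\beta}^2, \sigma_\beta^2, \sigma_\alpha^2)$, for

$$\begin{aligned}
t_{ee} &= \frac{\nu_e}{\theta_0^2} + \frac{\nu_\gamma}{\theta_1^2} + \frac{\nu_{\alpha\beta}}{\theta_2^2} + \frac{\nu_\beta}{\theta_3^2} + \frac{\nu_\alpha}{\theta_4^2} + \frac{1}{\theta_5^2}, \\
t_{\gamma\gamma} &= m^2\left(\frac{\nu_\gamma}{\theta_1^2} + \frac{\nu_{\alpha\beta}}{\theta_2^2} + \frac{\nu_\beta}{\theta_3^2} + \frac{\nu_\alpha}{\theta_4^2} + \frac{1}{\theta_5^2}\right), \\
t_{(\alpha\beta)} &= (cm)^2\left(\frac{\nu_{\alpha\beta}}{\theta_2^2} + \frac{\nu_\beta}{\theta_3^2} + \frac{\nu_\alpha}{\theta_4^2} + \frac{1}{\theta_5^2}\right), \\
t_{\beta\beta} &= a^2(cm)^2\left(\frac{\nu_\beta}{\theta_3^2} + \frac{1}{\theta_5^2}\right), \\
t_{\alpha\alpha} &= b^2(cm)^2\left(\frac{\nu_\alpha}{\theta_4^2} + \frac{1}{\theta_5^2}\right),
\end{aligned}$$

where

$$\nu_e = abc(m - 1), \quad \nu_\gamma = ab(c - 1), \quad \nu_{\alpha\beta} = (a - 1)(b - 1), \quad \nu_\beta = b - 1, \quad \nu_\alpha = a - 1.$$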
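
As a cross-check of the closed form, the information matrix can be computed in two ways: from the degrees of freedom and the eigenvalues theta_0 to theta_5, and by brute force from the trace formula. The following is a sketch under stated assumptions: NumPy is used, the function names are illustrative, and the parameter order (error, nested C, interaction, B, A) is the one used for the parameter vector in Section 5.1.

```python
import numpy as np
from functools import reduce


def info_closed_form(a, b, c, m, sigma2):
    """Information matrix from eigenvalues; sigma2 = (e, gamma, ab, beta, alpha)."""
    s_e, s_g, s_ab, s_b, s_a = sigma2
    th2 = s_e + m * s_g + c * m * s_ab
    theta = np.array([s_e, s_e + m * s_g, th2,
                      th2 + a * c * m * s_b,
                      th2 + b * c * m * s_a,
                      th2 + b * c * m * s_a + a * c * m * s_b])
    # Degrees of freedom attached to theta_0..theta_5 (last one is the mean).
    nu = np.array([a * b * c * (m - 1), a * b * (c - 1),
                   (a - 1) * (b - 1), b - 1, a - 1, 1])
    # coef[r, k] is the coefficient of variance component r in theta_k.
    coef = np.array([
        [1, 1, 1, 1, 1, 1],                      # sigma_e^2
        [0, m, m, m, m, m],                      # sigma_gamma^2
        [0, 0, c * m, c * m, c * m, c * m],      # sigma_alpha-beta^2
        [0, 0, 0, a * c * m, 0, a * c * m],      # sigma_beta^2
        [0, 0, 0, 0, b * c * m, b * c * m],      # sigma_alpha^2
    ], dtype=float)
    return 0.5 * (coef * (nu / theta ** 2)) @ coef.T


def info_brute_force(a, b, c, m, sigma2):
    """Information matrix from 0.5 * tr(V^-1 ZiZi' V^-1 ZjZj')."""
    I = np.eye

    def J(n):
        return np.ones((n, n))

    def k(*mats):
        return reduce(np.kron, mats)

    ZZ = [k(I(a), I(b), I(c), I(m)),   # error
          k(I(a), I(b), I(c), J(m)),   # nested C
          k(I(a), I(b), J(c), J(m)),   # interaction
          k(J(a), I(b), J(c), J(m)),   # factor B
          k(I(a), J(b), J(c), J(m))]   # factor A
    V = sum(s * M for s, M in zip(sigma2, ZZ))
    Vinv = np.linalg.inv(V)
    return np.array([[0.5 * np.trace(Vinv @ P @ Vinv @ Q) for Q in ZZ]
                     for P in ZZ])


sigma2 = (0.12, 0.13, 0.24, 0.25, 0.26)
closed = info_closed_form(3, 2, 2, 2, sigma2)
brute = info_brute_force(3, 2, 2, 2, sigma2)
print(np.allclose(closed, brute))
```

The closed form avoids forming or inverting the N × N matrix V, which matters when many candidate designs are compared.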



IV. DESIGN ENUMERATION AND GENERATION 

Groups of designs with a fixed total sample size are generated and enumerated for comparison. The total sample
size (N) was systematically chosen in such a way that the structure of each design generated is balanced. By a balanced
design, we mean that the same balanced two-stage hierarchical nested design is placed at each treatment combination of the
crossed factors, i.e. c and m are equal for all treatment combinations T. These groups of designs by our definition
constitute the design space. In order to obtain these balanced designs for a fixed total sample size N, the following
restrictions are imposed.

(1) $c \geq 2$, i.e. designs where c = 1 are not sufficient (the degrees of freedom for C would be 0).

(2) $m \geq 2$, to be able to obtain separate estimates of $\sigma_\gamma^2$ and $\sigma_e^2$.

The two conditions above mean that $cm \geq 4$ and, in fact, for a balanced design the smallest sample size that
can be obtained is 16. Figure 1 is the simplest balanced design. We give a set of rules to obtain groups of
designs for a fixed total sample size N.

(1) Choose N such that N = T × n, for all non-prime values of T (number of treatment combinations)
and n (number of observations in each treatment combination).

(2) For each distinct N = T × n from (1), split T = a × b and n = c × m such that $a, b \geq 2$ and
$c, m \geq 2$.

(3) From (2), N = a × b × c × m (all sets of four possible factors that generate the sample size).

(4) Obtain all possible permutations of the distinct a, b, c, m from (3), resulting in the total number of
groups of designs for a fixed total sample size.
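
The four rules above can be sketched as a short enumeration. This is a minimal illustration in plain Python, not the authors' code; the function name `enumerate_designs` is hypothetical. Requiring each of a, b, c and m to be at least 2 automatically makes T = ab and n = cm non-prime, so rule (1) does not need a separate check.

```python
def enumerate_designs(N):
    """All ordered (a, b, c, m) with a, b, c, m >= 2 and a*b*c*m == N."""
    designs = []
    for a in range(2, N + 1):
        if N % a:
            continue
        for b in range(2, N // a + 1):
            if (N // a) % b:
                continue
            n = N // (a * b)   # observations per treatment combination
            for c in range(2, n + 1):
                if n % c:
                    continue
                m = n // c
                if m >= 2:     # need at least two observations per level of C
                    designs.append((a, b, c, m))
    return designs


for N in (16, 24, 48, 64):
    print(N, len(enumerate_designs(N)))
```

Each ordered tuple counts as a separate design, because swapping, say, c and m changes the degrees of freedom available for the nested factor and for the error term.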

For the balanced design, based on the above restrictions, the smallest feasible sample size is 16. As an illustration:

• For N = 24, 4 and 6 are factors of 24, and 3 and 8 are also factors of 24, but only the pair 4 and 6
consists of non-prime numbers and so satisfies rule (1) above. Since 3 is a prime number, the pair 3 and 8
is not included as possible values of T and n.

24 = 4 × 6 = 2 × 2 × 2 × 3

• There are four possible permutations for a, b, c, m generated above:

> a = 2, b = 2, c = 2, m = 3
> a = 2, b = 2, c = 3, m = 2
> a = 2, b = 3, c = 2, m = 2
> a = 3, b = 2, c = 2, m = 2

This results in four groups of designs for N = 24.

• For N = 64,

8 × 8 and 4 × 16 are the only sets of factors that satisfy rule (1).

• N = 8 × 8 = 2 × 4 × 2 × 4, N = 4 × 16 = 2 × 2 × 4 × 4 and N = 4 × 16 = 2 × 2 × 2 × 8

• (a, b, c, m) generated from 8 × 8 and 4 × 16 have two sets of distinct values:

N = 64 = 2 × 2 × 4 × 4 and N = 64 = 2 × 2 × 2 × 8

• There are six possible permutations of (2, 2, 4, 4) and four possible permutations
of (2, 2, 2, 8) generated above:



> a = 2, b = 2, c = 4, m = 4
> a = 2, b = 4, c = 2, m = 4
> a = 2, b = 4, c = 4, m = 2
> a = 4, b = 2, c = 2, m = 4
> a = 4, b = 2, c = 4, m = 2
> a = 4, b = 4, c = 2, m = 2
> a = 2, b = 2, c = 2, m = 8
> a = 2, b = 2, c = 8, m = 2
> a = 2, b = 8, c = 2, m = 2
> a = 8, b = 2, c = 2, m = 2



This results in ten groups of designs for N = 64. The above steps can be used to obtain possible
sample sizes and generate the corresponding groups of designs. The groups of designs for possible sample
sizes between 0 and 100 (0 < N ≤ 100) are given in the table at the end of the manuscript.

V. OPTIMALITY 

Designs enumerated in the last section will be compared in terms of their ability to accurately estimate the five variance
components (based on the sample sizes). Since no closed-form analytical expression is available for the variance-covariance
matrix of the estimators in this linear random effects model, we examine optimality using the asymptotic
variance-covariance matrix. A design from a group of designs with equal sample size is said to be optimal if it minimizes
an optimality criterion related to the variance-covariance matrix of the parameter estimates. Equivalently, we seek the
design that maximizes an optimality criterion related to the information matrix of the five variance components. The
optimal design in a linear random effects model depends on the relative sizes of the true values of the variance
components, and we cannot investigate optimality unless an assumption is made about these true values. Since optimality
for such models is similar to that of nonlinear models, we borrow an idea from the optimal design theory of nonlinear
models and use local optimality. MATLAB code was written around the information matrix of Section 3.2 so that the
enumerated designs for a fixed total sample size can be compared for any configuration of the true values of the variance
components. In this paper we present a general result based on a particular configuration of the variance components
after empirically comparing designs for several values of these components in accordance with the configuration. The
D-optimality and A-optimality criteria were used.
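
The comparison described above can be sketched in Python (the authors' MATLAB code is not reproduced in the paper, so everything below is an illustrative reconstruction). Two assumptions are made and should be read as such: the D-criterion is taken as maximizing the log-determinant of the information matrix, and the A-criterion as minimizing the trace of its inverse, which is the textbook definition; the source does not spell out which form of the trace criterion was used, so the A-ranking from this sketch may differ from the reported one. All function names are hypothetical.

```python
import numpy as np


def enumerate_designs(N):
    """Ordered (a, b, c, m), every entry >= 2, with a*b*c*m == N."""
    return [(a, b, c, N // (a * b * c))
            for a in range(2, N + 1) if N % a == 0
            for b in range(2, N // a + 1) if (N // a) % b == 0
            for c in range(2, N // (a * b) + 1) if (N // (a * b)) % c == 0
            if N // (a * b * c) >= 2]


def information(design, sigma2):
    """Closed-form information matrix; sigma2 = (e, gamma, ab, beta, alpha)."""
    a, b, c, m = design
    s_e, s_g, s_ab, s_b, s_a = sigma2
    th2 = s_e + m * s_g + c * m * s_ab
    theta = np.array([s_e, s_e + m * s_g, th2,
                      th2 + a * c * m * s_b,
                      th2 + b * c * m * s_a,
                      th2 + b * c * m * s_a + a * c * m * s_b])
    nu = np.array([a * b * c * (m - 1), a * b * (c - 1),
                   (a - 1) * (b - 1), b - 1, a - 1, 1])
    coef = np.array([
        [1, 1, 1, 1, 1, 1],
        [0, m, m, m, m, m],
        [0, 0, c * m, c * m, c * m, c * m],
        [0, 0, 0, a * c * m, 0, a * c * m],
        [0, 0, 0, 0, b * c * m, b * c * m],
    ], dtype=float)
    return 0.5 * (coef * (nu / theta ** 2)) @ coef.T


def compare_designs(N, sigma2):
    """Return (D-best design, A-best design, per-design criterion rows)."""
    rows = []
    for d in enumerate_designs(N):
        M = information(d, sigma2)
        _, logdet = np.linalg.slogdet(M)           # D: larger is better
        a_value = np.trace(np.linalg.inv(M))       # A (assumed form): smaller is better
        rows.append((d, logdet, a_value))
    d_best = max(rows, key=lambda r: r[1])[0]
    a_best = min(rows, key=lambda r: r[2])[0]
    return d_best, a_best, rows


sigma2 = (0.12, 0.13, 0.24, 0.25, 0.26)
d_best, a_best, rows = compare_designs(48, sigma2)
print("D-best:", d_best, "A-best:", a_best)
```

Because the optimum is local, the call has to be repeated for every configuration of the variance components of interest; only `sigma2` changes between runs.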
5.1 Results and Discussion

For the local optimum designs, the parameter space is

$$\sigma^2 = [\sigma_e^2,\ \sigma_\gamma^2,\ \sigma_{\alpha\beta}^2,\ \sigma_\beta^2,\ \sigma_\alpha^2], \qquad \sigma_r^2 \in [0, 1], \quad r = e, \gamma, \alpha\beta, \beta, \alpha,$$

where $\sigma_r^2$ represents the proportion contributed by each individual variance component. In many
industrial experiments where this model can be useful, the variance components of the crossed factors are
expected to be at least as large as that of the nested factor. This work investigated optimal designs for possible
sample sizes between 0 and 100 for one configuration of the variance components. The configuration of variance
components used in this work is such that

$$\sigma_e^2 < \sigma_\gamma^2 < \sigma_{\alpha\beta}^2 < \sigma_\beta^2 < \sigma_\alpha^2.$$

We empirically compared designs for different proportions of the variance components based on the configuration
above. The A-optimal and D-optimal designs for only one proportional comparison of the five variance
components are presented in the table below.

$$\sigma^2 = [0.12,\ 0.13,\ 0.24,\ 0.25,\ 0.26]$$



Sample size (N)    TPD     D-optimal    A-optimal
24                 (4)     3            1
32                 (4)     3            1
36                 (6)     2            1
40                 (4)     4            1
48                 (16)    12           13
54                 (4)     4            1
56                 (4)     4            1
60                 (12)    12           9
64                 (10)    6            7
72                 (28)    28           13
80                 (16)    15           1
84                 (12)    12           1
88                 (4)     4            1
90                 (12)    7            1
96                 (40)    16           1
100                (6)     2            1



The table above shows the D-optimal and A-optimal designs for different sample sizes. The
second column (TPD) indicates the total number of designs generated for each sample size. The numbering of
designs follows the table at the end of the manuscript. For example, N = 48 has a total of sixteen (16) candidate
designs and, based on the numbering in the table at the end of the manuscript, Design Twelve (D12) is D-optimal and
Design Thirteen (D13) is A-optimal. Likewise, N = 72 has a total of twenty-eight (28) candidate designs;
Design Twenty-Eight (D28) is D-optimal and Design Thirteen (D13) is A-optimal.

From the above results, and from other empirical comparisons made for different proportional values of
the variance components under the same configuration, the following general statements can be made for the
choice of A-optimal and D-optimal designs.

Determinant Criterion 



(1) From the candidate designs generated for a particular sample size, select the designs in which the product
of the numbers of levels of the two crossed factors (a × b) is largest.

(2) Reduce the designs selected in (1) to those in which the degrees of freedom for the interaction
are largest.

(3) From the remaining designs in (2), since $\sigma_\alpha^2 > \sigma_\beta^2$, in most cases the design in which the
number of levels of the crossed factor A is at least as large as the number of levels of the crossed factor B is
selected, i.e. $a \geq b$ (except for some smaller sample sizes).

(4) If more than one design remains from (3), then in most cases the design with the larger number of
levels of the nested factor C is the D-optimal design.

Trace Criterion

(1) Select all designs in which the number of observations within each level of the nested factor C is
largest, i.e. m is largest.

(2) If there is more than one design in (1), select the one in which the number of levels of the nested factor C is
largest, i.e. c is largest.

(3) If more than one design remains from (2), the remaining designs should be empirically
investigated for optimality, as A-optimality in this case varies with the values of the variance
components and the sample sizes.
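
The two sets of rules can be read as lexicographic orderings on the candidate designs. The sketch below is one possible encoding, assuming plain Python and hypothetical function names; it is a rule-of-thumb shortcut, not a substitute for evaluating the criteria, and the text itself notes exceptions for smaller sample sizes. For N = 48 and N = 64 the shortcut picks (4, 3, 2, 2) and (4, 4, 2, 2) under the determinant rules, and (2, 2, 2, 6) and (2, 2, 2, 8) under the trace rules.

```python
def enumerate_designs(N):
    """Ordered (a, b, c, m), every entry >= 2, with a*b*c*m == N."""
    return [(a, b, c, N // (a * b * c))
            for a in range(2, N + 1) if N % a == 0
            for b in range(2, N // a + 1) if (N // a) % b == 0
            for c in range(2, N // (a * b) + 1) if (N // (a * b)) % c == 0
            if N // (a * b * c) >= 2]


def determinant_rule(designs):
    """Rules (1)-(4): largest ab, then largest (a-1)(b-1), then a >= b, then largest c."""
    return max(designs,
               key=lambda d: (d[0] * d[1],               # rule 1
                              (d[0] - 1) * (d[1] - 1),   # rule 2
                              d[0] >= d[1],              # rule 3
                              d[2]))                     # rule 4


def trace_rule(designs):
    """Rules (1)-(2): largest m, then largest c.

    Designs still tied after these two keys need empirical comparison
    (rule 3); max() simply returns the first of them.
    """
    return max(designs, key=lambda d: (d[3], d[2]))


for N in (48, 64):
    candidates = enumerate_designs(N)
    print(N, determinant_rule(candidates), trace_rule(candidates))
```

Encoding the rules as sort keys keeps their priority order explicit: a later rule is consulted only when all earlier ones tie.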

VI. CONCLUSION 

We have constructed optimal experimental designs in a linear random effects model with five variance components.
The model has a random nested factor nested within the treatment combinations of the crossed factors. We systematically
generated groups of designs for a fixed total sample size in such a way that the designs generated are balanced. Local
A-optimal and D-optimal designs for maximum likelihood estimators were obtained for a particular configuration of the
variance components after empirically comparing designs for several proportional values of the components. We also
presented a general result from the comparisons made. Although we have only generated designs for sample sizes between
0 and 100, the procedure stated in this paper can be used to generate and compare designs for larger sample sizes. The
procedure can also be used to compare different configurations of the variance components. Overall, for the linear random
effects model the D-optimal design is preferred, since samples are concentrated at the level where the true values of the
variance components are larger.



Table of sample sizes with total number of generated designs

N = 24
D    1   2   3   4
a    2   2   2   3
b    2   2   3   2
c    2   3   2   2
m    3   2   2   2

N = 36
D    1   2   3   4   5   6
a    2   3   2   2   3   3
b    2   3   3   3   2   2
c    3   2   2   3   2   3
m    3   2   3   2   3   2

N = 32
D    1   2   3   4
a    2   2   2   4
b    2   2   4   2
c    2   4   2   2
m    4   2   2   2

N = 40
D    1   2   3   4
a    2   2   2   5
b    2   2   5   2
c    2   5   2   2
m    5   2   2   2

N = 48
D    1   2   3   4   5   6   7   8   9   10  11  12  13  14  15  16
a    2   2   3   3   2   2   4   4   2   2   3   4   2   2   2   6
b    3   3   2   2   4   4   2   2   2   2   4   3   2   2   6   2
c    2   4   2   4   2   3   2   3   3   4   2   2   2   6   2   2
m    4   2   4   2   3   2   3   2   4   3   2   2   6   2   2   2

N = 72
D    1   2   3   4   5   6   7   8   9   10  11  12  13  14  15  16  17  18
(entries for a, b, c, m not recoverable)